Course: STATW4701 EXPLORATORY DATA ANALYS/VISUA (Spring 2016)
Professor: Dr. Mengqian Lu
Team members: Aditi Hota, Christopher Lam, Robert Carl Minnich, Xiaohui (Eartha) Guo, Xiaowei Zhang, Yijing (Phoebe) Sun
Our story began with an article published in The Washington Post (How financial need sways graduation rates, 2007). This article compares the graduation rates between Pell-eligible students (students who are qualified for federal Pell grants, ie. whose families make less than $50,000 per year) and non-Pell students to evaluate the effectiveness of federal financial aid on student success beyond mere college access. After analysis of data from 1150 schools in the US, it was found that Pell-eligible students in general have lower graduation rates (51%) than non-Pell students (65%). Although results are greatly varied by individual schools, it revealed a fact that some schools didn’t do enough in helping their Pell-eligible students get through college. Our goal here is to futher explore the relationship between federal students aid and graduation rate (by school types) as well as some other factors, for example, default rate on loans, and hopefully this could help people with their college selecting decisions.
Our idea can also be summarized by the following figure:
To start, we wanted to know about our audience - students interested in federal financial aid. We use an aggregated Google search queries dataset (2005 to 2010), which includes the search queries for “FAFSA” -Free application for Federal Student Aid across the country within the five years. We chose data around " Dec, Jan, Feb, March, Apr" as our focus of study, because those are the time periods for students to apply for Financial aid. Since FAFSA is required for every student when applying for federal student aid, we assume it could give us some insights about the time and location factors trending in people’s interests in student aid. The colors indicate different search frequencies within each state given a certain time period. (From Red to Yellow represent “searched most” to “searched least”)
Based on the graphic above, we can see that:
We also did a similar graph for the search of “Pell Grants”:
The results are pretty consistent with the previous one, which raises another interesting point: people who search for “FAFSA” are also very likely to search for “Pell Grants”.
We then moved to our first question: how federal financial aid plays into the choice of school.
To begin, we divided schools into three different types in order to compare the Average Aid Disbursements within each type. And here, we choose three Financial Aid packages to study.
| Type | Example |
|---|---|
| Public School | Ohio State University |
| Private/Non-Profit School | Columbia Unviversity |
| Proprietary School | Kaplan Unviversity |
| Aid Type | Description |
|---|---|
| FSEOG | Federal Supplemental Educational Opportunity Grant: (no repaid) |
| Perkins | Fedueral Perkins Loan: (low interest loan) |
| FWS | Federal Work-Study Program: Provides funds for part time employment |
The following chord diagram can help understand how much money is available per student for different types of federal aid. This is split up by institution type.
From this chord diagram, we have some interesting findings:We then sought to move to the next level: Investigating the success rates of students at different schools receiving various forms of federal aid.
We wanted initially to relate the financial aid types to graduation rate at each school, but we ran into some problems finding the data, since until last year it was not required to be reported:
Source: http://hechingerreport.org/an-unprecedented-look-at-pell-grant-graduation-rates-from-1149-schools/
We can see from the data that there are still many schools not reporting their proper statistics and this is most likely because they are trying to hide information that could hurt their enrollment. We made a chord diagram to actually show the information incompleteness in our dataset. In this Chord Diagram, we basically took out all the schools with unreported information in order to see which types of schools are more likely have missing values. For example: Over all schools have no value in the “Completion Rates” field, most of them are proprietary institutions, and only few belong to private schools. Proprietary schools are also more likely to have unreported information on “Default rate on loans” field than others. The other two fields “Debt- Did Finish” and “Debt- Did Not Finish” don’t have a clear preference - all three types of schools are equally likely to generate missing values in these two fields, suggesting schools may be afraid to reveal the persistence of debt even amongst those who obtain a degree and are thus more likely to get a job. The description of the four fields we focus on are as follows:
Completion Rates: Completion rate for first-time, full-time students at four-year institutions (150% of expected time to completion/6 years).
Default Rates on Loans: Two-year cohort default rate repayment.
Debt for students that did finish: The median debt for students who have completed aid.
Debt for students that did not finish: The median debt for students who have not completed aid.
Although, the dataset is not complete, we still want to have a view of the information we have:
In order to do that, we take out all schools with complete information in the dataset and analyze their performance in the following grouped bar plot:
Finally, we created an interactive map to better assist our users with their choice of schools based on the location and some other key attributes. How to use this map:
Once you click on individual schools, it will give you the information about school’s name, type, Pell Average Disbursement and so on. Take Columbia University for example:
We can see that: The average Pell disbursment in Columbia is about $2549.31, and our graduation rate is about 93% with only 1.9% loan default rate on average.